Big data classification based on improved parallel k-nearest neighbor

Abstract

In response to the rapid growth of many sorts of information, highway data has continued to evolve in the direction of big data in terms of scale, type, and structure, exhibiting the characteristics of multi-source heterogeneous data. The k-nearest neighbor (KNN) join has received a lot of interest in recent years due to its wide range of applications. Processing KNN joins is time-consuming and inefficient because of the quadratic structure of the method. As the number of applications dealing with vast amounts of data grows, KNN joins become more sophisticated. The authors seek to save computing resources by leveraging large numbers of threads and multiprocessors. Six popular datasets are used to apply and evaluate the sequential and parallel performance of the technique. These datasets are used to compare the sequential and parallel versions of the method. When compared to the matching multi-core solution, the final implementation saves computing resources. It has been optimized to use as little RAM as possible, allowing it to handle high-resolution photos without sacrificing efficiency. The authors will use the technique they presented using Spark Radoop. Our research validates the efficacy and scalability of the proposed method.
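The quadratic cost mentioned in the abstract comes from comparing every query point with every training point. The sketch below is not the authors' implementation; it is a minimal illustration, assuming a brute-force KNN classifier with majority voting, of how the query set can be partitioned into chunks that are handled by separate workers. The names `knn_chunk` and `parallel_knn_classify` and the toy data are hypothetical.

```python
import heapq
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def knn_chunk(queries, train, k):
    """Brute-force KNN for one chunk: each query is compared with every training point."""
    out = []
    for q in queries:
        # Keep the k smallest (distance, label) pairs; this full scan is the quadratic part.
        nearest = heapq.nsmallest(k, ((math.dist(q, x), y) for x, y in train))
        votes = Counter(label for _, label in nearest)
        out.append(votes.most_common(1)[0][0])
    return out


def parallel_knn_classify(queries, train, k=3, workers=4):
    """Split the queries into contiguous chunks and classify each chunk in a worker."""
    size = max(1, math.ceil(len(queries) / workers))
    chunks = [queries[i:i + size] for i in range(0, len(queries), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves chunk order, so predictions line up with the input queries.
        results = pool.map(knn_chunk, chunks, [train] * len(chunks), [k] * len(chunks))
        return [label for chunk in results for label in chunk]


# Toy data (hypothetical): two well-separated clusters labeled "a" and "b".
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.8, 5.2), "b")]
queries = [(0.05, 0.05), (5.0, 5.1), (0.3, 0.3), (4.9, 4.9)]
predictions = parallel_knn_classify(queries, train, k=3, workers=2)
```

Because each chunk only reads the shared training set, the chunks are independent and the partitioning does not change the predictions. In CPython, a thread pool running pure-Python distance code is limited by the global interpreter lock, so a real speedup would need a process pool, a native distance kernel, or a distributed engine such as the Spark setup the abstract mentions.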

Related resources

Big Data Classification using Fuzzy K-Nearest Neighbor

Because of the massive increase in the size of the data it becomes troublesome to perform effective analysis using the current traditional techniques. Big data put forward a lot of challenges due to its several characteristics like volume, velocity, variety, variability, value and complexity. Today there is not only a necessity for efficient data mining techniques to process large volume of dat...

k-Nearest Neighbor Classification on Spatial Data

Classification of spatial data streams is crucial, since the training dataset changes often. Building a new classifier each time can be very costly with most techniques. In this situation, k-nearest neighbor (KNN) classification is a very good choice, since no residual classifier needs to be built ahead of time. KNN is extremely simple to implement and lends itself to a wide variety of variatio...

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...

K-Nearest Neighbor Classification Using Anatomized Data

This paper analyzes k nearest neighbor classification with training data anonymized using anatomy. Anatomy preserves all data values, but introduces uncertainty in the mapping between identifying and sensitive values. We first study the theoretical effect of the anatomized training data on the k nearest neighbor error rate bounds, nearest neighbor convergence rate, and Bayesian error. We then v...

Journal

Journal title: TELKOMNIKA Telecommunication Computing Electronics and Control

Year: 2023

ISSN: 1693-6930, 2302-9293

DOI: https://doi.org/10.12928/telkomnika.v21i1.24290